Version-Wide Software Birthmark via Machine Learning

Authors

Abstract

Identifying the credibility of executable files is critical for the security of an operating system. Modern systems rely on code signing, which uses a default-valid trust model, to identify their publishers. Malware could pass software validation by using counterfeit code-signing certificates. Although certificates can be revoked by CAs, previous research showed that the revocation delay can take as long as 5.6 months. In this paper, we attempt to identify software with multiple versions without relying on public key infrastructure (PKI), where a new-version file is usually developed incrementally based on previous versions. The features shared among different versions can be extracted for identifying the software. Accordingly, we present a software-birthmark scheme to serve our purpose. Our scheme generates a cross-version birthmark for the same software. The proposed binary-classification model applies a machine learning algorithm to imported and exported function names extracted from different-version files. To evaluate the performance of version-wide birthmarks, our experiments include 138 Windows kernel32.dll files and 545 firefox.exe files. We also use multiple machine learning algorithms for comparison. The results show that our scheme can effectively identify derivations of this software and can be used to identify suspicious software.
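The idea of a binary classifier over imported and exported function names can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not say which algorithms or feature encoding were used, so the perceptron, the bag-of-names encoding, the helper names (build_vocabulary, train_perceptron, predict) and the toy function-name sets below are all assumptions chosen for illustration. In practice the name sets would come from parsing the import and export tables of real binaries.

```python
from typing import Dict, FrozenSet, List, Tuple

Sample = Tuple[FrozenSet[str], int]

# Hypothetical toy data (not from the paper). Each file is reduced to the
# set of function names it imports or exports. Label 1 = a version of the
# target software, label 0 = some other software.
SAMPLES: List[Sample] = [
    (frozenset({"CreateFileW", "ReadFile", "CloseHandle", "GetLastError"}), 1),
    (frozenset({"CreateFileW", "ReadFile", "CloseHandle", "GetLastError",
                "WriteFile"}), 1),
    (frozenset({"CreateFileW", "ReadFile", "CloseHandle", "GetLastError",
                "WriteFile", "GetTickCount64"}), 1),
    (frozenset({"socket", "connect", "send", "recv"}), 0),
    (frozenset({"RegOpenKeyExW", "RegCloseKey", "CloseHandle"}), 0),
    (frozenset({"malloc", "free", "printf"}), 0),
]


def build_vocabulary(samples: List[Sample]) -> Dict[str, int]:
    """Map every function name seen in training to a feature index."""
    names = sorted({name for feats, _ in samples for name in feats})
    return {name: i for i, name in enumerate(names)}


def score(feats: FrozenSet[str], vocab: Dict[str, int],
          weights: List[float], bias: float) -> float:
    """Linear score over binary features; names unseen in training are ignored."""
    return bias + sum(weights[vocab[n]] for n in feats if n in vocab)


def predict(feats: FrozenSet[str], vocab: Dict[str, int],
            weights: List[float], bias: float) -> int:
    return 1 if score(feats, vocab, weights, bias) > 0 else 0


def train_perceptron(samples: List[Sample], vocab: Dict[str, int],
                     epochs: int = 20) -> Tuple[List[float], float]:
    """Classic perceptron updates; stops early once an epoch has no errors."""
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for feats, label in samples:
            step = label - predict(feats, vocab, weights, bias)  # -1, 0 or +1
            if step != 0:
                for name in feats:
                    weights[vocab[name]] += step
                bias += step
                errors += 1
        if errors == 0:
            break
    return weights, bias


if __name__ == "__main__":
    vocab = build_vocabulary(SAMPLES)
    weights, bias = train_perceptron(SAMPLES, vocab)
    # A hypothetical later version: shares most names and adds a new one.
    new_version = frozenset({"CreateFileW", "ReadFile", "CloseHandle",
                             "GetLastError", "WriteFile", "SleepEx"})
    unrelated = frozenset({"socket", "recv", "printf"})
    print(predict(new_version, vocab, weights, bias))
    print(predict(unrelated, vocab, weights, bias))
```

The sketch relies on the property the abstract describes: versions developed incrementally share most of their function names, so a model fitted on earlier versions can still recognise a later one even when it introduces names not seen in training.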

Similar resources

Open Source Software Detection using Function-level Static Software Birthmark

As open-source software (OSS) is widely used, many IT organizations adopt OSS without obeying some guidelines for open-source license agreements. To reduce risks related to open-source licenses, the organizations should meet the requirements for OSS licenses. Because some OSS components may be given from major upstream suppliers in binary form, it is very hard to verify whether a binary program...

Machine Learning for Software Reuse

Recent work on learning apprentice systems suggests new approaches for using interactive programming environments to promote software reuse. Methodologies for software specification and validation yield natural domains of application for explanation-based learning techniques. This paper develops a relation between data abstractions in software and explanation-based generalization problems and sh...

Machine learning in genome-wide association studies.

Recently, genome-wide association studies have substantially expanded our knowledge about genetic variants that influence the susceptibility to complex diseases. Although standard statistical tests for each single-nucleotide polymorphism (SNP) separately are able to capture main genetic effects, different approaches are necessary to identify SNPs that influence disease risk jointly or in comple...

Gas Detection via Machine Learning

We present an Electronic Nose (ENose), which is aimed at identifying the presence of one out of two gases, possibly detecting the presence of a mixture of the two. Estimation of the concentrations of the components is also performed for a volatile organic compound (VOC) constituted by methanol and acetone, for the ranges 40-400 and 22-220 ppm (parts-per-million), respectively. Our system contai...

Machine Learning via Multiresolution Approximation

We consider the classification problem as a problem of approximation of a given training set. This approximation is constructed in a multiresolution framework, and organized in a tree-structure. It allows efficient training and query, both in constant time per training point. The proposed method is efficient for low-dimensional classification and regression estimation problems with large data s...

Journal

Journal title: IEEE Access

Year: 2021

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2021.3103186